{"componentChunkName":"component---src-templates-blog-tsx","path":"/blog/Touching-up-a-Data-Plot/","result":{"data":{"markdownRemark":{"html":"<p>In this tutorial I'm going to show you how to touch up a generic scatter plot made with the Matplotlib package in Python. You should be fairly familiar with Python and know the basics of plotting with Matplotlib.</p>\n<h2>Looking at the Data</h2>\n<p>The data to be visualized are the predictions of a neural net trying to guess the happiness score of a country by looking at some data points. As for the workbench, I set up <a href=\"https://jupyter.org/\">Jupyter Lab</a>.</p>\n<p>Using the Pandas package, we load a <code class=\"language-text\">DataFrame</code> (basically a data table) from the <code class=\"language-text\">pickle</code> file provided in the <a href=\"Touching-up-a-Data-Plot.zip\">source materials</a>. For our convenience, we save references to the columns of interest. In most cases, you can treat these references like lists of numbers. 
So don't worry about the underlying magic.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token keyword\">import</span> pandas <span class=\"token keyword\">as</span> pd\n\ndata <span class=\"token operator\">=</span> pd<span class=\"token punctuation\">.</span>read_pickle<span class=\"token punctuation\">(</span><span class=\"token string\">'predicted.pickle'</span><span class=\"token punctuation\">)</span>\n\nscores <span class=\"token operator\">=</span> data<span class=\"token punctuation\">[</span><span class=\"token string\">\"Score\"</span><span class=\"token punctuation\">]</span>\npredictions <span class=\"token operator\">=</span> data<span class=\"token punctuation\">[</span><span class=\"token string\">\"Score prediction\"</span><span class=\"token punctuation\">]</span>\nerrors <span class=\"token operator\">=</span> data<span class=\"token punctuation\">[</span><span class=\"token string\">\"Score error\"</span><span class=\"token punctuation\">]</span></code></pre></div>\n<p><code class=\"language-text\">scores</code> are the countries' actual happiness scores from the original dataset, ranging from 0 to 10. <code class=\"language-text\">predictions</code> are the predicted happiness scores from my neural net. <code class=\"language-text\">errors</code> represent how much the actual values differ from the predicted ones. In this case they are the <em>squared errors</em>.</p>\n<h2>Styling the Axis and Frame</h2>\n<p>Let's start by plotting the dataset using Matplotlib. 
We also indicate the error (deviation between the actual and predicted happiness score) by a colour and show a corresponding <em>colorbar</em> on the right side.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token keyword\">import</span> matplotlib<span class=\"token punctuation\">.</span>pyplot <span class=\"token keyword\">as</span> plt\n\nfig<span class=\"token punctuation\">,</span> ax <span class=\"token operator\">=</span> plt<span class=\"token punctuation\">.</span>subplots<span class=\"token punctuation\">(</span>nrows<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">,</span> ncols<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">)</span>\nscatter <span class=\"token operator\">=</span> ax<span class=\"token punctuation\">.</span>scatter<span class=\"token punctuation\">(</span>scores<span class=\"token punctuation\">,</span> predictions<span class=\"token punctuation\">,</span> c<span class=\"token operator\">=</span>errors<span class=\"token punctuation\">)</span>\nfig<span class=\"token punctuation\">.</span>colorbar<span class=\"token punctuation\">(</span>scatter<span class=\"token punctuation\">,</span> ax<span class=\"token operator\">=</span>ax<span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"01_primitive_plot.svg\" alt=\"01_primitive_plot\"></span>\n<p>Because the predicted and the actual scores share the same range we opt for equal scales. 
Thus, we set the limits of the y-axis (<code class=\"language-text\">ylim</code>) to the limits of the x-axis (<code class=\"language-text\">xlim</code>). To visually emphasize this equality, we force the diagram to be square.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token comment\"># make diagram square</span>\nax<span class=\"token punctuation\">.</span>axis<span class=\"token punctuation\">(</span><span class=\"token string\">'square'</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># make both axes use the same limits</span>\nax<span class=\"token punctuation\">.</span>set_ylim<span class=\"token punctuation\">(</span>ax<span class=\"token punctuation\">.</span>get_xlim<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"02_equal_and_square.svg\" alt=\"Square plot with equal axis limits\"></span>\n<p>Changing the background is done by calling <code class=\"language-text\">ax.set_facecolor(...)</code>. We can pass a hex colour value to it. Easy as pie! To get rid of the frame we need to get the top, right, bottom, and left spines and make each of them invisible, e.g. 
<code class=\"language-text\">ax.spines[&#39;top&#39;].set_visible(False)</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token comment\"># set the background color</span>\nax<span class=\"token punctuation\">.</span>set_facecolor<span class=\"token punctuation\">(</span><span class=\"token string\">'#efeff1'</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># remove frame</span>\nax<span class=\"token punctuation\">.</span>spines<span class=\"token punctuation\">[</span><span class=\"token string\">'top'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>set_visible<span class=\"token punctuation\">(</span><span class=\"token boolean\">False</span><span class=\"token punctuation\">)</span>\nax<span class=\"token punctuation\">.</span>spines<span class=\"token punctuation\">[</span><span class=\"token string\">'right'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>set_visible<span class=\"token punctuation\">(</span><span class=\"token boolean\">False</span><span class=\"token punctuation\">)</span>\nax<span class=\"token punctuation\">.</span>spines<span class=\"token punctuation\">[</span><span class=\"token string\">'bottom'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>set_visible<span class=\"token punctuation\">(</span><span class=\"token boolean\">False</span><span class=\"token punctuation\">)</span>\nax<span class=\"token punctuation\">.</span>spines<span class=\"token punctuation\">[</span><span class=\"token string\">'left'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">.</span>set_visible<span class=\"token punctuation\">(</span><span 
class=\"token boolean\">False</span><span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"03_no_frame.svg\" alt=\"Plot without frame\"></span>\n<p>Next, we are going to plot a white grid via <code class=\"language-text\">ax.grid(...)</code>. Unfortunately, by default the grid is drawn on top of the data points. Thus, we need to call <code class=\"language-text\">ax.set_axisbelow(True)</code> to send the axis ticks and the grid to the background.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token comment\"># draw grid</span>\nax<span class=\"token punctuation\">.</span>grid<span class=\"token punctuation\">(</span>which<span class=\"token operator\">=</span><span class=\"token string\">\"major\"</span><span class=\"token punctuation\">,</span> color<span class=\"token operator\">=</span><span class=\"token string\">\"w\"</span><span class=\"token punctuation\">,</span> linestyle<span class=\"token operator\">=</span><span class=\"token string\">'-'</span><span class=\"token punctuation\">,</span> linewidth<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># send axis ticks and grid to the back</span>\nax<span class=\"token punctuation\">.</span>set_axisbelow<span class=\"token punctuation\">(</span><span class=\"token boolean\">True</span><span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span 
class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"04_grid.svg\" alt=\"Plot with grid\"></span>\n<p>As you can see above, the labelled tick marks on the axes do not match the rest of the design. Thus, we are going to adjust their colour and width by using <code class=\"language-text\">ax.tick_params()</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token comment\"># set color of tick marks and labels, and width of tick marks</span>\nax<span class=\"token punctuation\">.</span>tick_params<span class=\"token punctuation\">(</span>axis<span class=\"token operator\">=</span><span class=\"token string\">'x'</span><span class=\"token punctuation\">,</span> colors<span class=\"token operator\">=</span><span class=\"token string\">'#777777'</span><span class=\"token punctuation\">,</span> width<span class=\"token operator\">=</span><span class=\"token number\">0</span><span class=\"token punctuation\">)</span>\nax<span class=\"token punctuation\">.</span>tick_params<span class=\"token punctuation\">(</span>axis<span class=\"token operator\">=</span><span class=\"token string\">'y'</span><span class=\"token punctuation\">,</span> colors<span class=\"token operator\">=</span><span class=\"token string\">'#777777'</span><span class=\"token punctuation\">,</span> width<span class=\"token operator\">=</span><span class=\"token number\">0</span><span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"05_nicer_axis.svg\" alt=\"Plot with improved tick marks\"></span>\n<h2>Improving the 
Colorbar</h2>\n<p>To adjust the colorbar, we have to dig deeper. We have to retrieve a reference to it by saving the return value of <code class=\"language-text\">fig.colorbar(...)</code> as <code class=\"language-text\">colorbar</code>. Now we can remove its frame by accessing its outline and setting the visibility via <code class=\"language-text\">colorbar.outline.set_visible(...)</code>.</p>\n<p>The biggest challenge is to change the appearance of the tick marks and labels. I found this to be documented poorly. The key is that the object <code class=\"language-text\">colorbar</code> has a member <code class=\"language-text\">ax</code> which contains another member <code class=\"language-text\">axes</code> allowing you to make changes to the axes of the colorbar itself. Thus, we can configure the tick marks like those of any other plot by calling <code class=\"language-text\">colorbar.ax.axes.tick_params(...)</code>.</p>\n<p>However, to change the colour of the tick labels we must first get a handle for the <code class=\"language-text\">yticklabels</code> via <code class=\"language-text\">colorbar.ax.axes.get_yticklabels()</code>. Second, we use the function <code class=\"language-text\">plt.setp(...)</code> to set a specific attribute. 
In this case we are changing the colour of our tick labels as you can see below.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token keyword\">import</span> matplotlib<span class=\"token punctuation\">.</span>pyplot <span class=\"token keyword\">as</span> plt\n\nfig<span class=\"token punctuation\">,</span> ax <span class=\"token operator\">=</span> plt<span class=\"token punctuation\">.</span>subplots<span class=\"token punctuation\">(</span>nrows<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">,</span> ncols<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">)</span>\nscatter <span class=\"token operator\">=</span> ax<span class=\"token punctuation\">.</span>scatter<span class=\"token punctuation\">(</span>scores<span class=\"token punctuation\">,</span> predictions<span class=\"token punctuation\">,</span> c<span class=\"token operator\">=</span>errors<span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># get reference of colorbar</span>\ncolorbar <span class=\"token operator\">=</span> fig<span class=\"token punctuation\">.</span>colorbar<span class=\"token punctuation\">(</span>scatter<span class=\"token punctuation\">,</span> ax<span class=\"token operator\">=</span>ax<span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># remove frame</span>\ncolorbar<span class=\"token punctuation\">.</span>outline<span class=\"token punctuation\">.</span>set_visible<span class=\"token punctuation\">(</span><span class=\"token boolean\">False</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># set color of ticks</span>\ncolorbar<span class=\"token 
punctuation\">.</span>ax<span class=\"token punctuation\">.</span>axes<span class=\"token punctuation\">.</span>tick_params<span class=\"token punctuation\">(</span>axis<span class=\"token operator\">=</span><span class=\"token string\">'y'</span><span class=\"token punctuation\">,</span> color<span class=\"token operator\">=</span><span class=\"token string\">'#777777'</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># get reference of tick labels</span>\nyticklabels <span class=\"token operator\">=</span> colorbar<span class=\"token punctuation\">.</span>ax<span class=\"token punctuation\">.</span>axes<span class=\"token punctuation\">.</span>get_yticklabels<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token comment\"># set color attribute of tick labels</span>\nplt<span class=\"token punctuation\">.</span>setp<span class=\"token punctuation\">(</span>yticklabels<span class=\"token punctuation\">,</span> color<span class=\"token operator\">=</span><span class=\"token string\">'#777777'</span><span class=\"token punctuation\">)</span>\n\n<span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"06_improved_colorbar.svg\" alt=\"Plot with improved colorbar\"></span>\n<h2>Adjusting the Colormap</h2>\n<p>Doesn't this look way prettier than Matplotlib's default settings? However, the blue-to-yellow colour gradient used to indicate the errors might not work for you. Of course, Matplotlib already ships with a wide range of colour gradients (so-called <em>colormaps</em>) to choose from. 
But sometimes, this isn't enough, for example, if your plot has to comply with the style guide of your company.</p>\n<p>I'm going to show you an easy way to brew your own colormap, like you would do in PowerPoint or Adobe Illustrator.</p>\n<p>Basically, you want to have a linear colour gradient between two or more colours. In this case we can use <code class=\"language-text\">LinearSegmentedColormap.from_list(...)</code> from the module <code class=\"language-text\">matplotlib.colors</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\">LinearSegmentedColormap<span class=\"token punctuation\">.</span>from_list<span class=\"token punctuation\">(</span>name<span class=\"token operator\">=</span><span class=\"token string\">'greyToRed'</span><span class=\"token punctuation\">,</span> colors<span class=\"token operator\">=</span><span class=\"token punctuation\">[</span><span class=\"token string\">'#555566'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'#F00A41'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">,</span> N<span class=\"token operator\">=</span><span class=\"token number\">256</span><span class=\"token punctuation\">)</span></code></pre></div>\n<p>As you can see, the function expects a <code class=\"language-text\">name</code> and a list of colours as <code class=\"language-text\">colors</code> describing your gradient. In this case, our gradient starts at <code class=\"language-text\">#555566</code> and stops at <code class=\"language-text\">#F00A41</code>. The parameter <code class=\"language-text\">N</code> specifies how many discrete colour steps the gradient is divided into. The more steps, the smoother the gradient. 
Below you can see a small discrete sample of our colormap.</p>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"07_colormap.svg\" alt=\"Home brewed Colormap\"></span>\n<p>To use the gradient, we will go back in our source code, insert the missing <code class=\"language-text\">import</code> statement, create a colormap called <code class=\"language-text\">cmap</code>, and modify the call to the <code class=\"language-text\">ax.scatter</code> function by passing the colormap as <code class=\"language-text\">cmap</code>.</p>\n<div class=\"gatsby-highlight\" data-language=\"python\"><pre class=\"language-python\"><code class=\"language-python\"><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\n<span class=\"token keyword\">import</span> matplotlib<span class=\"token punctuation\">.</span>pyplot <span class=\"token keyword\">as</span> plt\n<span class=\"token keyword\">from</span> matplotlib<span class=\"token punctuation\">.</span>colors <span class=\"token keyword\">import</span> LinearSegmentedColormap\n\n<span class=\"token comment\"># create linear colour gradient</span>\ncmap <span class=\"token operator\">=</span> LinearSegmentedColormap<span class=\"token punctuation\">.</span>from_list<span class=\"token punctuation\">(</span><span class=\"token string\">'greyToRed'</span><span class=\"token punctuation\">,</span> <span class=\"token punctuation\">[</span><span class=\"token string\">'#555566'</span><span class=\"token punctuation\">,</span> <span class=\"token string\">'#F00A41'</span><span class=\"token punctuation\">]</span><span class=\"token punctuation\">,</span> N<span class=\"token operator\">=</span><span class=\"token number\">256</span><span class=\"token punctuation\">)</span>\n\nfig<span class=\"token punctuation\">,</span> ax <span class=\"token operator\">=</span> plt<span class=\"token punctuation\">.</span>subplots<span 
class=\"token punctuation\">(</span>nrows<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">,</span> ncols<span class=\"token operator\">=</span><span class=\"token number\">1</span><span class=\"token punctuation\">)</span>\nscatter <span class=\"token operator\">=</span> ax<span class=\"token punctuation\">.</span>scatter<span class=\"token punctuation\">(</span>scores<span class=\"token punctuation\">,</span> predictions<span class=\"token punctuation\">,</span> c<span class=\"token operator\">=</span>errors<span class=\"token punctuation\">,</span> cmap<span class=\"token operator\">=</span>cmap<span class=\"token punctuation\">)</span>\n\n<span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span>\n\nfig<span class=\"token punctuation\">.</span>show<span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span></code></pre></div>\n<span class=\"gatsby-pri-wrapper\"><img class=\"gatsby-pri-image gatsby-pri-image-svg\" src=\"08_result.svg\" alt=\"Touched up plot\"></span>\n<p>Above you can see the final result of our efforts.<sup id=\"fnref-1\"><a href=\"#fn-1\" class=\"footnote-ref\">1</a></sup> I hope you enjoyed this tutorial on how to touch up plots in Python. 
Feel free to download the <a href=\"Touching-up-a-Data-Plot.zip\">source code</a> and try it out yourself!</p>\n\n      <div class=\"footnotes\">\n        <hr/>\n        <ol >\n    \n    <li class=\"footnote-list-item\" id=\"fn-1\" >\n          \n        \n      <a href=\"#fnref-1\" class=\"footnote-backref\" style=\"display:inline;text-decoration: none;\">\n        ^\n      </a>\n    <p class=\"footnote-paragraph\" style=\"display:inline; margin-left: 5px;\">Note that I tweaked the limits of the axes by setting ax.set_xlim([3.5, 8.5])</p>\n      </li>\n      \n    </ol></div>","fields":{"slug":"/blog/Touching-up-a-Data-Plot/"},"frontmatter":{"title":"Touching up a Data Plot","date":"2020-04-19T00:00:00.000Z","short":"Touching up a generic data plot in Python made with Matplotlib to transform it into a professional looking visualization.","subtitle":null,"medium":null,"keywords":null}}},"pageContext":{"slug":"/blog/Touching-up-a-Data-Plot/"}},"staticQueryHashes":[]}